Unit-weighted Regression
In statistics, unit-weighted regression is a simplified and robust version (Wainer & Thissen, 1976) of multiple regression analysis where only the intercept term is estimated. That is, it fits a model

$$\hat{y} = \hat{f}(\mathbf{x}) = \hat{b} + \sum_i x_i$$

where each $x_i$ is a binary variable, perhaps multiplied by an arbitrary weight. Contrast this with the more common multiple regression model, where each predictor has its own estimated coefficient:

$$\hat{y} = \hat{f}(\mathbf{x}) = \hat{b} + \sum_i \hat{w}_i x_i$$

In the social sciences, unit-weighted regression is sometimes used for binary classification, i.e. to predict a yes-no answer where $\hat{y} < 0$ indicates "no" and $\hat{y} \ge 0$ indicates "yes". It is easier to interpret than multiple linear regression (known as linear discriminant analysis in the classification case).
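
The difference between the two models can be sketched in a few lines of Python. This is only an illustration: the function names, the predictor values, and the intercept of -1.5 are hypothetical and do not come from any of the cited studies.

```python
def unit_weighted_score(x, intercept):
    """Intercept plus the plain sum of the predictors (every weight is 1)."""
    return intercept + sum(x)


def multiple_regression_score(x, intercept, weights):
    """Intercept plus a weighted sum, one estimated weight per predictor."""
    return intercept + sum(w * xi for w, xi in zip(weights, x))


def classify(score):
    """Negative scores mean "no"; zero or positive scores mean "yes"."""
    return "yes" if score >= 0 else "no"


# Hypothetical case: three binary predictors, two of them present.
x = [1, 0, 1]
intercept = -1.5  # hypothetical estimated intercept

score = unit_weighted_score(x, intercept)
print(score, classify(score))  # 0.5 yes
```

With all weights set to one, `multiple_regression_score` reduces to `unit_weighted_score`, which is why only the intercept needs to be estimated.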


Unit weights

Unit-weighted regression is a method of robust regression that proceeds in three steps. First, predictors for the outcome of interest are selected; ideally, there should be good empirical or theoretical reasons for the selection. Second, the predictors are converted to a standard form. Finally, the predictors are added together, and this sum is called the variate, which is used as the predictor of the outcome.


Burgess method

The Burgess method was first presented by the sociologist Ernest W. Burgess in a 1928 study to determine success or failure of inmates placed on parole. First, he selected 21 variables believed to be associated with parole success. Next, he converted each predictor to the standard form of zero or one (Burgess, 1928). When predictors had two values, the value associated with the target outcome was coded as one. Burgess selected success on parole as the target outcome, so a predictor such as a ''history of theft'' was coded as "yes" = 0 and "no" = 1. These coded values were then added to create a predictor score, so that higher scores predicted a better chance of success. The scores could range from zero (no predictors of success) to 21 (all 21 predictors scored as predicting success).

For predictors with more than two values, the Burgess method selects a cutoff score based on subjective judgment. As an example, a study using the Burgess method (Gottfredson & Snyder, 2005) selected as one predictor the number of complaints for delinquent behavior. With failure on parole as the target outcome, the number of complaints was coded as follows: "zero to two complaints" = 0, and "three or more complaints" = 1 (Gottfredson & Snyder, 2005, p. 18).
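
A minimal sketch of Burgess-style scoring follows, with success on parole as the target outcome. Only the ''history of theft'' coding is taken from the description above; the other two predictors, their cutoffs, and all names in the code are hypothetical.

```python
# Hypothetical coding rules; each maps a raw value to 0 or 1,
# where 1 is the value associated with success on parole.
CODING = {
    "history_of_theft": lambda answer: 0 if answer == "yes" else 1,
    # Multi-valued predictor: the cutoff of 2 is an assumed judgment call.
    "prior_arrests": lambda count: 1 if count <= 2 else 0,
    "steady_employment": lambda answer: 1 if answer == "yes" else 0,
}


def burgess_score(case):
    """Add up the 0/1 codes of all predictors for one case."""
    return sum(rule(case[name]) for name, rule in CODING.items())


inmate = {"history_of_theft": "no", "prior_arrests": 4, "steady_employment": "yes"}
print(burgess_score(inmate))  # 1 + 0 + 1 = 2
```

With three predictors the score ranges from 0 to 3, just as Burgess's 21 predictors gave a range of 0 to 21.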


Kerby method

The Kerby method is similar to the Burgess method, but differs in two ways. First, while the Burgess method uses subjective judgment to select a cutoff score for a multi-valued predictor with a binary outcome, the Kerby method uses classification and regression tree (CART) analysis. In this way, the selection of the cutoff score is based not on subjective judgment, but on a statistical criterion, such as the point where the chi-square value is a maximum. The second difference is that while the Burgess method is applied to a binary outcome, the Kerby method can apply to a multi-valued outcome, because CART analysis can identify cutoff scores in such cases, using a criterion such as the point where the t-value is a maximum. Because CART analysis is not only binary, but also recursive, the result can be that a predictor variable will be divided again, yielding two cutoff scores. The standard form for each predictor is that a score of one is added when CART analysis creates a partition.

One study (Kerby, 2003) selected as predictors the five traits of the Big Five personality traits, predicting a multi-valued measure of suicidal ideation. Next, the personality scores were converted into standard form with CART analysis. When the CART analysis yielded one partition, the result was like the Burgess method in that the predictor was coded as either zero or one. But for the measure of neuroticism, the result was two cutoff scores. Because higher neuroticism scores correlated with more suicidal thinking, the two cutoff scores led to the following coding: "low Neuroticism" = 0, "moderate Neuroticism" = 1, "high Neuroticism" = 2 (Kerby, 2003).
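
The idea of choosing a cutoff by a statistical criterion can be illustrated with a single split on a binary outcome. The sketch below is a simplification, not a full recursive CART analysis and not Kerby's actual procedure: it tries every candidate cutoff and keeps the one with the largest 2×2 chi-square value. The data and function names are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # an empty row or column: no association can be measured
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def best_cutoff(values, outcomes):
    """Return (cutoff, chi_square) for the split `value <= cutoff` that
    maximizes chi-square against a 0/1 outcome; (None, -1.0) if no split exists."""
    best = (None, -1.0)
    for cut in sorted(set(values))[:-1]:  # the largest value cannot split the data
        pairs = list(zip(values, outcomes))
        a = sum(1 for v, y in pairs if v <= cut and y == 0)
        b = sum(1 for v, y in pairs if v <= cut and y == 1)
        c = sum(1 for v, y in pairs if v > cut and y == 0)
        d = sum(1 for v, y in pairs if v > cut and y == 1)
        stat = chi_square_2x2(a, b, c, d)
        if stat > best[1]:
            best = (cut, stat)
    return best


# Hypothetical data: trait scores and a binary outcome.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
outcome = [0, 0, 0, 0, 1, 1, 1, 1]
cut, stat = best_cutoff(scores, outcome)
print(cut, stat)  # 4 8.0

# Standard form: one point is added for falling above the cutoff.
coded = [1 if s > cut else 0 for s in scores]
```

Applying the same search again within one of the resulting groups would give a second cutoff, which is how a 0/1/2 coding such as the one for neuroticism can arise.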


''z''-score method

Another method can be applied when the predictors are measured on a continuous scale. In such a case, each predictor can be converted into a standard score, or ''z''-score, so that all the predictors have a mean of zero and a standard deviation of one. With this method of unit-weighted regression, the variate is a sum of the ''z''-scores (e.g., Dawes, 1979; Bobko, Roth, & Buster, 2007).
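
A minimal sketch of the ''z''-score variate, using only the Python standard library. The data are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption; the choice rescales every ''z''-score by the same factor and does not change the ranking of cases.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one predictor to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def unit_weighted_variate(columns):
    """Sum the z-scores across predictors, giving one variate per case."""
    standardized = [z_scores(col) for col in columns]
    return [sum(case) for case in zip(*standardized)]


# Hypothetical data: two predictors measured on three cases.
grades = [2.0, 3.0, 4.0]
test_scores = [400, 500, 600]

variate = unit_weighted_variate([grades, test_scores])
print([round(v, 3) for v in variate])
```

Each predictor should be oriented so that higher values point toward the same outcome before summing; a predictor that points the other way would be multiplied by -1 first.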


Literature review

The first empirical study using unit-weighted regression is widely considered to be a 1928 study by sociologist Ernest W. Burgess. He used 21 variables to predict parole success or failure, and the results suggest that unit weights are a useful tool in making decisions about which inmates to parole. Of those inmates with the best scores, 98% did in fact succeed on parole; and of those with the worst scores, only 24% did in fact succeed (Burgess, 1928).

The mathematical issues involved in unit-weighted regression were first discussed in 1938 by Samuel Stanley Wilks, a leading statistician who had a special interest in multivariate analysis. Wilks described how unit weights could be used in practical settings, when data were not available to estimate beta weights. For example, a small college may want to select good students for admission, but the school may have no money to gather data and conduct a standard multiple regression analysis. In this case, the school could use several predictors: high school grades, SAT scores, teacher ratings. Wilks (1938) showed mathematically why unit weights should work well in practice.

Frank Schmidt (1971) conducted a simulation study of unit weights. His results showed that Wilks was indeed correct and that unit weights tend to perform well in simulations of practical studies.

Robyn Dawes (1979) discussed the use of unit weights in applied studies, referring to the robust beauty of unit-weighted models. Jacob Cohen also discussed the value of unit weights and noted their practical utility. Indeed, he wrote, "As a practical matter, most of the time, we are better off using unit weights" (Cohen, 1990, p. 1306).

Dave Kerby (2003) showed that unit weights compare well with standard regression, doing so with a cross-validation study; that is, he derived beta weights in one sample and applied them to a second sample. The outcome of interest was suicidal thinking, and the predictor variables were broad personality traits. In the cross-validation sample, the correlation between personality and suicidal thinking was slightly stronger with unit-weighted regression (''r'' = .48) than with standard multiple regression (''r'' = .47).

Gottfredson and Snyder (2005) compared the Burgess method of unit-weighted regression to other methods, with a construction sample of N = 1,924 and a cross-validation sample of N = 7,552. Using the Pearson point-biserial correlation, the effect size in the cross-validation sample for the unit-weights model was ''r'' = .392, which was somewhat larger than for logistic regression (''r'' = .368) and predictive attribute analysis (''r'' = .387), and less than multiple regression only in the third decimal place (''r'' = .397).

In a review of the literature on unit weights, Bobko, Roth, and Buster (2007) noted that "unit weights and regression weights perform similarly in terms of the magnitude of cross-validated multiple correlation, and empirical studies have confirmed this result across several decades" (p. 693).

Andreas Graefe applied an equal weighting approach to nine established multiple regression models for forecasting U.S. presidential elections. Across the ten elections from 1976 to 2012, equally weighted predictors reduced the forecast error of the original regression models on average by four percent. An equal-weights model that includes all variables provided calibrated forecasts that reduced the error of the most accurate regression model by 29 percent.


Example

An example may clarify how unit weights can be useful in practice. Brenna Bry and colleagues (1982) addressed the question of what causes drug use in adolescents. Previous research had made use of multiple regression; with this method, it is natural to look for the best predictor, the one with the highest beta weight. Bry and colleagues noted that one previous study had found that early use of alcohol was the best predictor. Another study had found that alienation from parents was the best predictor. Still another study had found that low grades in school were the best predictor. The failure to replicate was clearly a problem, one that could be caused by bouncing betas.

Bry and colleagues suggested a different approach: instead of looking for the best predictor, they looked at the number of predictors. In other words, they gave a unit weight to each predictor. Their study had six predictors: 1) low grades in school, 2) lack of affiliation with religion, 3) early age of alcohol use, 4) psychological distress, 5) low self-esteem, and 6) alienation from parents. To convert the predictors to standard form, each risk factor was scored as absent (scored as zero) or present (scored as one). For example, the coding for low grades in school was as follows: "C or higher" = 0, "D or F" = 1. The results showed that the number of risk factors was a good predictor of drug use: adolescents with more risk factors were more likely to use drugs.

The model used by Bry and colleagues was that drug users do not differ in any special way from non-drug users. Rather, they differ in the number of problems they must face. "The number of factors an individual must cope with is more important than exactly what those factors are" (p. 277). Given this model, unit-weighted regression is an appropriate method of analysis.


Beta weights

In standard multiple regression, each predictor is multiplied by a number that is called the ''beta weight'', ''regression weight'' or ''weighted regression coefficient'' (denoted βW or BW). The prediction is obtained by adding these products along with a constant. When the weights are chosen to give the best prediction by some criterion, the model is referred to as a proper linear model. Therefore, multiple regression is a proper linear model. By contrast, unit-weighted regression is called an improper linear model.


Model specification

Standard multiple regression hinges on the assumption that all relevant predictors of the outcome are included in the regression model. This assumption is called model specification. A model is said to be correctly specified when all relevant predictors are included in the model, and all irrelevant predictors are excluded from the model. In practical settings, it is rare for a study to be able to determine all relevant predictors a priori. In this case, the model is misspecified and the estimates for the beta weights suffer from omitted variable bias. As a result, the beta weights may change from one sample to the next, a situation sometimes called the problem of the bouncing betas. It is this problem with bouncing betas that makes unit-weighted regression a useful method.
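
The instability of estimated weights can be illustrated with a small simulation. The sketch below shows only one contributing mechanism, small samples with correlated predictors, and does not simulate omitted variables as such. It assumes a hypothetical population in which both predictors matter equally, and it uses the textbook two-predictor formula for standardized beta weights. All names and parameter values are illustrative.

```python
import random
from statistics import mean


def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def standardized_betas(x1, x2, y):
    """Standardized beta weights for a two-predictor regression."""
    r1, r2, r12 = corr(x1, y), corr(x2, y), corr(x1, x2)
    denom = 1 - r12 ** 2
    return (r1 - r2 * r12) / denom, (r2 - r1 * r12) / denom


def draw_sample(rng, n):
    """Draw n cases with two correlated predictors of equal true weight."""
    x1, x2, y = [], [], []
    for _ in range(n):
        shared = rng.gauss(0, 1)          # common part makes x1 and x2 correlate
        a = shared + rng.gauss(0, 0.5)
        b = shared + rng.gauss(0, 0.5)
        x1.append(a)
        x2.append(b)
        y.append(a + b + rng.gauss(0, 2))  # both predictors have weight 1
    return x1, x2, y


rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(5):
    b1, b2 = standardized_betas(*draw_sample(rng, 30))
    print(round(b1, 2), round(b2, 2))
```

Each printed pair comes from a fresh sample of 30 cases. Comparing the pairs across samples shows how much the estimated weights, and even the apparent "best" predictor, can vary although the two predictors are equally important in the simulated population.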


See also

*Linear regression
*Regression analysis
*Robust regression


References

*Bobko, P., Roth, P. L., & Buster, M. A. (2007). "The usefulness of unit weights in creating composite scores: A literature review, application to content validity, and meta-analysis". ''Organizational Research Methods'', volume 10, pages 689–709.
*Burgess, E. W. (1928). "Factors determining success or failure on parole". In A. A. Bruce (Ed.), ''The Workings of the Indeterminate Sentence Law and Parole in Illinois'' (pp. 205–249). Springfield, Illinois: Illinois State Parole Board.
*Cohen, Jacob. (1990). "Things I have learned (so far)". ''American Psychologist'', volume 45, pages 1304–1312.
*Dawes, Robyn M. (1979). "The robust beauty of improper linear models in decision making". ''American Psychologist'', volume 34, pages 571–582.
*Gottfredson, D. M., & Snyder, H. N. (July 2005). ''The mathematics of risk classification: Changing data into valid instruments for juvenile courts''. Pittsburgh, Penn.: National Center for Juvenile Justice. NCJ 209158.
*Kerby, Dave S. (2003). "CART analysis with unit-weighted regression to predict suicidal ideation from Big Five traits". ''Personality and Individual Differences'', volume 35, pages 249–261.
*Schmidt, Frank L. (1971). "The relative efficiency of regression and simple unit predictor weights in applied differential psychology". ''Educational and Psychological Measurement'', volume 31, pages 699–714.
*Wainer, H., & Thissen, D. (1976). "Three steps toward robust regression". ''Psychometrika'', volume 41(1), pages 9–34.


Further reading

*Dana, J., & Dawes, R. M. (2004). "The superiority of simple alternatives to regression for social science predictions". ''Journal of Educational and Behavioral Statistics'', volume 29(3), pages 317–331.
*Dawes, R. M., & Corrigan, B. (1974). "Linear models in decision making". ''Psychological Bulletin'', volume 81, pages 95–106.
*Einhorn, H. J., & Hogarth, R. M. (1975). "Unit weighting schemes for decision making". ''Organizational Behavior and Human Performance'', volume 13(2), pages 171–192.
*Hakeem, M. (1948). "The validity of the Burgess method of parole prediction". ''American Journal of Sociology'', volume 53(5), pages 376–386.
*Newman, J. R., Seaver, D., & Edwards, W. (1976). ''Unit versus differential weighting schemes for decision making: A method of study and some preliminary results''. Los Angeles, CA: Social Science Research Institute.
*Raju, N. S., Bilgic, R., Edwards, J. E., & Fleer, P. F. (1997). "Methodology review: Estimation of population validity and cross-validity, and the use of equal weights in prediction". ''Applied Psychological Measurement'', volume 21(4), pages 291–305.
*Ree, M. J., Carretta, T. R., & Earles, J. A. (1998). "In top-down decisions, weighting variables does not matter: A consequence of Wilks' theorem". ''Organizational Research Methods'', volume 1(4), pages 407–420.
*Wainer, H. (1978). "On the sensitivity of regression and regressors". ''Psychological Bulletin'', volume 85(2), pages 267–273. doi:10.1037/0033-2909.85.2.267


External links


*Chris Stucchio blog - Why a pro/con list is 75% as good as your fancy machine learning algorithm